Online personalized recommendation services are generally hosted in the cloud where users query the cloud-based model to receive recommended input such as merchandise of interest or news feed. State-of-the-art recommendation models rely on sparse and dense features to represent users' profile information and the items they interact with. Although sparse features account for 99% of the total model size, there was not enough attention paid to the potential information leakage through sparse features. These sparse features are employed to track users' behavior, e.g., their click history, object interactions, etc., potentially carrying each user's private information. Sparse features are represented as learned embedding vectors that are stored in large tables, and personalized recommendation is performed by using a specific user's sparse feature to index through the tables. Even with recently-proposed methods that hides the computation happening in the cloud, an attacker in the cloud may be able to still track the access patterns to the embedding tables. This paper explores the private information that may be learned by tracking a recommendation model's sparse feature access patterns. We first characterize the types of attacks that can be carried out on sparse features in recommendation models in an untrusted cloud, followed by a demonstration of how each of these attacks leads to extracting users' private information or tracking users by their behavior over time.
translated by 谷歌翻译
联合学习(FL)旨在对多个数据所有者持有的分布式数据执行隐私的机器学习。为此,FL要求数据所有者在本地执行培训,并与中央服务器共享梯度更新(而不是私人输入),然后将其安全地汇总在多个数据所有者上。尽管汇总本身并不能证明提供隐私保护,但先前的工作表明,如果批处理大小足够大,则足够了。在本文中,我们提出了鸡尾酒会攻击(CPA),与先前的信念相反,能够从汇总的渐变中恢复私人输入,这是批量较大的大小。 CPA利用了至关重要的见解,即来自完全连接的层的总梯度是其输入的线性组合,这使我们将梯度反演作为盲源分离(BSS)问题(非正式地称为鸡尾酒会问题)。我们适应独立的组件分析(ICA) - BSS问题的经典解决方案 - 恢复针对完全连接和卷积网络的私人输入,并表明CPA明显优于先前的梯度反转攻击,对成像网的输入量表,并表现出Imagenet大小的输入的范围最高可达1024的大批量。
translated by 谷歌翻译
低功率边缘-AI功能对于支持元视野的设备扩展现实(XR)应用至关重要。在这项工作中,我们研究了两个代表性的XR工作负载:(i)手动检测和(ii)眼睛分割,用于硬件设计空间探索。对于这两种应用,我们都会训练深层神经网络,并分析量化和硬件特定瓶颈的影响。通过模拟,我们评估了CPU和两个收缩推理加速器实现。接下来,我们将这些硬件解决方案与先进的技术节点进行比较。评估了将最新的新兴非易失性记忆技术(STT/SOT/VGSOT MRAM)集成到XR-AI推论管道中的影响。我们发现,可以通过在7nm节点的设计中引入非挥发性记忆来实现手部检测(IPS = 40)和眼部分割(IPS = 6)的显着能源益处(IPS = 40)(IPS = 6)。 (推断每秒)。此外,由于MRAM与传统的SRAM相比,由于MRAM的较小形式,我们可以大大减少面积(> = 30%)。
translated by 谷歌翻译
本文探讨了超线性增长趋势的环境影响,从整体角度来看,跨越数据,算法和系统硬件。我们通过在行业规模机器学习用例中检查模型开发周期来表征AI计算的碳足迹,同时考虑系统硬件的生命周期。进一步迈出一步,我们捕获AI计算的操作和制造碳足迹,并为硬件 - 软件设计和尺度优化的结束分析以及如何帮助降低AI的整体碳足迹。根据行业经验和经验教训,我们分享关键挑战,并在AI的许多方面上绘制了重要的发展方向。我们希望本文提出的关键信息和见解能够激发社区以环保的方式推进AI领域。
translated by 谷歌翻译
Intelligently extracting and linking complex scientific information from unstructured text is a challenging endeavor particularly for those inexperienced with natural language processing. Here, we present a simple sequence-to-sequence approach to joint named entity recognition and relation extraction for complex hierarchical information in scientific text. The approach leverages a pre-trained large language model (LLM), GPT-3, that is fine-tuned on approximately 500 pairs of prompts (inputs) and completions (outputs). Information is extracted either from single sentences or across sentences in abstracts/passages, and the output can be returned as simple English sentences or a more structured format, such as a list of JSON objects. We demonstrate that LLMs trained in this way are capable of accurately extracting useful records of complex scientific knowledge for three representative tasks in materials chemistry: linking dopants with their host materials, cataloging metal-organic frameworks, and general chemistry/phase/morphology/application information extraction. This approach represents a simple, accessible, and highly-flexible route to obtaining large databases of structured knowledge extracted from unstructured text. An online demo is available at http://www.matscholar.com/info-extraction.
translated by 谷歌翻译
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
translated by 谷歌翻译
Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene on human disease. For AI models to be used clinically, they need to be made safe, reproducible and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provide purpose-specific AI model architectures, transformations and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software-development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by and receiving contributions from research, clinical and industrial teams from around the world, who are pursuing applications spanning nearly every aspect of healthcare.
translated by 谷歌翻译
以前在外围防御游戏中的研究主要集中在完全可观察到的环境上,在该环境中,所有玩家都知道真正的玩家状态。但是,这对于实际实施而言是不现实的,因为捍卫者可能必须感知入侵者并估计其国家。在这项工作中,我们在照片真实的模拟器和现实世界中研究外围防御游戏,要求捍卫者从视力中估算入侵者状态。我们通过域随机化训练一个基于机器学习的系统,用于入侵者姿势检测,该系统汇总了多个视图,以减少状态估计错误并适应防御策略来解决此问题。我们新介绍性能指标来评估基于视觉的外围防御。通过广泛的实验,我们表明我们的方法改善了国家的估计,最终在两场比赛中的VS-1-Intruder游戏和2-Fefenders-VS-1-Intruder游戏中最终进行了外围防御性能。
translated by 谷歌翻译
基于注意力的神经网络在许多AI任务中都普遍存在。尽管其出色的算法性能,但注意力机制和前馈网络(FFN)的使用仍需要过多的计算和内存资源,这通常会损害其硬件性能。尽管已经引入了各种稀疏变体,但大多数方法仅着重于缓解算法级别上的二次注意力缩放,而无需明确考虑将其方法映射到真实硬件设计上的效率。此外,大多数努力仅专注于注意机制或FFN,但没有共同优化这两个部分,导致当前的大多数设计在处理不同的输入长度时缺乏可扩展性。本文从硬件角度系统地考虑了不同变体中的稀疏模式。在算法级别上,我们提出了Fabnet,这是一种适合硬件的变体,它采用统一的蝴蝶稀疏模式来近似关注机制和FFN。在硬件级别上,提出了一种新颖的适应性蝴蝶加速器,可以在运行时通过专用硬件控件配置,以使用单个统一的硬件引擎加速不同的蝴蝶层。在远程 - ARENA数据集上,FabNet达到了与香草变压器相同的精度,同时将计算量减少10到66次,参数数量为2至22次。通过共同优化算法和硬件,我们的基于FPGA的蝴蝶加速器在归一化到同一计算预算的最新加速器上达到了14.2至23.2倍的速度。与Raspberry Pi 4和Jetson Nano上优化的CPU和GPU设计相比,我们的系统在相同的功率预算下的最大273.8和15.1倍。
translated by 谷歌翻译
ICECUBE是一种用于检测1 GEV和1 PEV之间大气和天体中微子的光学传感器的立方公斤阵列,该阵列已部署1.45 km至2.45 km的南极的冰盖表面以下1.45 km至2.45 km。来自ICE探测器的事件的分类和重建在ICeCube数据分析中起着核心作用。重建和分类事件是一个挑战,这是由于探测器的几何形状,不均匀的散射和冰中光的吸收,并且低于100 GEV的光,每个事件产生的信号光子数量相对较少。为了应对这一挑战,可以将ICECUBE事件表示为点云图形,并将图形神经网络(GNN)作为分类和重建方法。 GNN能够将中微子事件与宇宙射线背景区分开,对不同的中微子事件类型进行分类,并重建沉积的能量,方向和相互作用顶点。基于仿真,我们提供了1-100 GEV能量范围的比较与当前ICECUBE分析中使用的当前最新最大似然技术,包括已知系统不确定性的影响。对于中微子事件分类,与当前的IceCube方法相比,GNN以固定的假阳性速率(FPR)提高了信号效率的18%。另外,GNN在固定信号效率下将FPR的降低超过8(低于半百分比)。对于能源,方向和相互作用顶点的重建,与当前最大似然技术相比,分辨率平均提高了13%-20%。当在GPU上运行时,GNN能够以几乎是2.7 kHz的中位数ICECUBE触发速率的速率处理ICECUBE事件,这打开了在在线搜索瞬态事件中使用低能量中微子的可能性。
translated by 谷歌翻译